interpretable reinforcement learning
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
- Europe > United Kingdom > England > Greater London > London (0.04)
Towards Interpretable Reinforcement Learning Using Attention Augmented Agents
Inspired by recent work in attention models for image captioning and question answering, we present a soft attention model for the reinforcement learning domain. This model bottlenecks the view of an agent by a soft, top-down attention mechanism, forcing the agent to focus on task-relevant information by sequentially querying its view of the environment. The output of the attention mechanism allows direct observation of the information used by the agent to select its actions, enabling easier interpretation of this model than of traditional models. We analyze the different strategies the agents learn and show that a handful of strategies arise repeatedly across different games. We also show that the model learns to query separately about space and content ("where" vs. "what"). We demonstrate that an agent using this mechanism can achieve performance competitive with state-of-the-art models on ATARI tasks while still being interpretable.
- North America > Canada (0.04)
- Europe > United Kingdom > England > Greater London > London (0.04)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Vision (0.94)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.84)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.54)
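As a rough illustration of the attention scheme this abstract describes (the variable names, toy dimensions, and random inputs below are mine, not the authors'), a soft top-down read can be sketched as a scaled dot product between a query split into "what" and "where" parts and keys that concatenate content features with a fixed spatial basis:

```python
import numpy as np

def softmax(x):
    x = x - x.max()
    e = np.exp(x)
    return e / e.sum()

def attend(features, spatial_basis, query_what, query_where):
    """features: (H*W, C) content keys; spatial_basis: (H*W, S) location keys.

    The query's "what" part is matched against content features and its
    "where" part against the spatial basis; a softmax over locations yields
    an attention map that can be inspected directly.
    """
    keys = np.concatenate([features, spatial_basis], axis=1)   # (H*W, C+S)
    query = np.concatenate([query_what, query_where])          # (C+S,)
    logits = keys @ query / np.sqrt(query.size)                # scaled dot product
    weights = softmax(logits)                                  # attention map over locations
    read = weights @ features                                  # (C,) attended content
    return weights, read

rng = np.random.default_rng(0)
H, W, C, S = 4, 4, 8, 6
feats = rng.normal(size=(H * W, C))
basis = rng.normal(size=(H * W, S))   # stands in for a fixed spatial (e.g. Fourier) basis
w, r = attend(feats, basis, rng.normal(size=C), rng.normal(size=S))
print(w.shape, r.shape)   # (16,) (8,)
```

Because `w` is a normalized distribution over image locations, it is exactly the kind of output an analyst can visualize as an attention map over the agent's view.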
A Survey of Explainable Reinforcement Learning: Targets, Methods and Needs
The success of recent Artificial Intelligence (AI) models has been accompanied by the opacity of their internal mechanisms, due notably to the use of deep neural networks. In order to understand these internal mechanisms and explain the output of these AI models, a set of methods has been proposed, grouped under the domain of eXplainable AI (XAI). This paper focuses on a sub-domain of XAI, called eXplainable Reinforcement Learning (XRL), which aims to explain the actions of an agent that has learned by reinforcement learning. We propose an intuitive taxonomy based on two questions, "What" and "How". The first question focuses on the target that the method explains, while the second relates to the way the explanation is provided. We use this taxonomy to provide a state-of-the-art review of over 250 papers. In addition, we present a set of domains close to XRL which we believe should get attention from the community. Finally, we identify some needs for the field of XRL.
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.28)
- North America > United States > New York > New York County > New York City (0.14)
- Europe > Austria > Vienna (0.14)
- (103 more...)
- Overview (1.00)
- Research Report > New Finding (0.67)
- Health & Medicine (1.00)
- Energy (1.00)
- Education (1.00)
- (2 more...)
- Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Reviews: Towards Interpretable Reinforcement Learning Using Attention Augmented Agents
The paper is well-written and clear; the architecture is described in detail through a diagram (Figure 1 on page 2), with the math in section 2 expanding on the key components of the attention mechanism. High-level details for the RL training setup, implemented baselines, and condensed results are provided in the body of the paper. Detailed learning curves for each of the compared approaches are presented in the appendix (which is appropriate, given that the task-specific learning performance is secondary to the analysis of the attention mechanism). The analysis section is thorough, and I specifically appreciated the section at the end comparing the learned attention mechanism to prior work on saliency maps. Model/Architecture Notes: While the proposed model is a straightforward extension of query-key-value attention to tasks in RL, there are two interesting architectural features: First, "queries" for their attention mechanism can be decomposed into features that act on content (which the paper refers to as the "what"), and features that act on spatial location (which the paper refers to as the "where").
Towards Interpretable Reinforcement Learning Using Attention Augmented Agents
Inspired by recent work in attention models for image captioning and question answering, we present a soft attention model for the reinforcement learning domain. This model bottlenecks the view of an agent by a soft, top-down attention mechanism, forcing the agent to focus on task-relevant information by sequentially querying its view of the environment. The output of the attention mechanism allows direct observation of the information used by the agent to select its actions, enabling easier interpretation of this model than of traditional models. We analyze the different strategies the agents learn and show that a handful of strategies arise repeatedly across different games. We also show that the model learns to query separately about space and content ("where" vs. "what").
Towards Interpretable Reinforcement Learning with Constrained Normalizing Flow Policies
Rietz, Finn, Schaffernicht, Erik, Heinrich, Stefan, Stork, Johannes A.
Reinforcement learning policies are typically represented by black-box neural networks, which are non-interpretable and not well-suited for safety-critical domains. To address both of these issues, we propose constrained normalizing flow policies as interpretable and safe-by-construction policy models. We achieve safety for reinforcement learning problems with instantaneous safety constraints, for which we can exploit domain knowledge by analytically constructing a normalizing flow that ensures constraint satisfaction. The normalizing flow corresponds to an interpretable sequence of transformations on action samples, each ensuring alignment with respect to a particular constraint. Our experiments reveal benefits beyond interpretability in an easier learning objective and maintained constraint satisfaction throughout the entire learning process. Our approach leverages constraints over reward engineering while offering enhanced interpretability, safety, and direct means of providing domain knowledge to the agent without relying on complex reward functions.
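The "interpretable sequence of transformations on action samples" can be illustrated with a deliberately minimal, hypothetical flow step (this is a toy of my own, not the paper's construction): a tanh squashing transform that maps any unconstrained base-policy sample into a box constraint by construction, so constraint satisfaction holds at every point of training:

```python
import numpy as np

class TanhBoxFlow:
    """One invertible transform enforcing a box constraint low <= a <= high.

    A real constrained-flow policy would compose several such transforms,
    one per constraint; this single step just shows the mechanism.
    """
    def __init__(self, low, high):
        self.low = np.asarray(low, dtype=float)
        self.high = np.asarray(high, dtype=float)

    def forward(self, z):
        # z: unconstrained sample from the base policy (e.g. a Gaussian)
        a01 = 0.5 * (np.tanh(z) + 1.0)                   # R -> (0, 1)
        return self.low + a01 * (self.high - self.low)   # (0, 1) -> (low, high)

    def inverse(self, a):
        a01 = (a - self.low) / (self.high - self.low)
        return np.arctanh(2.0 * a01 - 1.0)

flow = TanhBoxFlow(low=[-1.0, 0.0], high=[1.0, 0.5])
z = np.random.default_rng(1).normal(size=(100, 2)) * 2.0
a = flow.forward(z)
print(a.min(axis=0), a.max(axis=0))  # every action stays inside the box
```

The transform is easy to read off ("squash, then rescale into the allowed range"), which is the sense in which such a construction is safe and interpretable at once.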
Towards a Research Community in Interpretable Reinforcement Learning: the InterpPol Workshop
Kohler, Hector, Delfosse, Quentin, Festor, Paul, Preux, Philippe
Embracing the pursuit of intrinsically explainable reinforcement learning raises crucial questions: what distinguishes explainability from interpretability? Should explainable and interpretable agents be developed outside of domains where transparency is imperative? What advantages do interpretable policies offer over neural networks? How can we rigorously define and measure interpretability in policies, without user studies? Which reinforcement learning paradigms are best suited to developing interpretable agents? Can Markov Decision Processes integrate interpretable state representations? In addition to motivating an Interpretable RL community centered around the aforementioned questions, we propose the first venue dedicated to Interpretable RL: the InterpPol Workshop.
- Europe > Germany > Hesse > Darmstadt Region > Darmstadt (0.05)
- Europe > France > Hauts-de-France > Nord > Lille (0.05)
- North America > United States > Massachusetts > Hampshire County > Amherst (0.04)
- Europe > United Kingdom > England > Greater London > London (0.04)
- Research Report (0.50)
- Overview (0.47)
Methodology for Interpretable Reinforcement Learning for Optimizing Mechanical Ventilation
Lee, Joo Seung, Mahendra, Malini, Aswani, Anil
Mechanical ventilation is a critical life-support intervention that uses a machine to deliver controlled air and oxygen to a patient's lungs, assisting or replacing spontaneous breathing. While several data-driven approaches have been proposed to optimize ventilator control strategies, they often lack interpretability and agreement with general domain knowledge. This paper proposes a methodology for interpretable reinforcement learning (RL) using decision trees for mechanical ventilation control. Using a causal, nonparametric model-based off-policy evaluation, we evaluate the policies in their ability to gain increases in SpO2 while avoiding aggressive ventilator settings, which are known to cause ventilator-induced lung injuries and other complications. Numerical experiments using MIMIC-III data on real patients' intensive care unit stays demonstrate that the decision tree policy outperforms the behavior cloning policy and is comparable to a state-of-the-art RL policy. Future work concerns better aligning the cost function with medical objectives to generate deeper clinical insights.
- North America > United States > California > San Francisco County > San Francisco (0.28)
- North America > United States > California > Alameda County > Berkeley (0.04)
- Asia > Middle East > Israel (0.04)
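One common route to a decision-tree policy of the kind the abstract above describes is to distill a black-box teacher into a small tree. The toy below (the feature, the threshold, and the `fit_stump` helper are all hypothetical, not from the paper) fits a one-split stump by exhaustive search over imitation error:

```python
import numpy as np

def fit_stump(states, teacher_actions):
    """Pick the (feature, threshold, labels) minimizing disagreement with the teacher."""
    best = None
    for f in range(states.shape[1]):
        for t in np.unique(states[:, f]):
            for left, right in [(0, 1), (1, 0)]:
                pred = np.where(states[:, f] <= t, left, right)
                err = float(np.mean(pred != teacher_actions))
                if best is None or err < best[0]:
                    best = (err, f, t, left, right)
    return best

rng = np.random.default_rng(0)
# Hypothetical logged states: a single SpO2-like feature per timestep.
spo2 = rng.uniform(85, 100, size=200).reshape(-1, 1)
# Hypothetical black-box teacher: intervene (action 1) when SpO2 < 92.
teacher = (spo2[:, 0] < 92).astype(int)

err, f, t, left, right = fit_stump(spo2, teacher)
print(f"if x[{f}] <= {t:.1f}: action {left} else {right}  (imitation error {err:.2f})")
```

The resulting rule is a single human-readable threshold, which is the interpretability payoff a tree-structured policy offers over the teacher it imitates; a real deployment would of course validate such a distilled policy with off-policy evaluation, as the paper does.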